The majority wins: a method for combining speaker diarization systems
Abstract
In this paper we present a method for combining multiple diarization systems into a single system by applying a majority voting scheme. The voting scheme selects the best segmentation purely on the basis of the output of each system. On our development set of NIST Rich Transcription evaluation meetings, the voting method improves our system on all evaluation conditions. For the single distant microphone condition, DER performance improved by 7.8% (relative) compared to the best input system. For the multiple distant microphone condition, the improvement is 3.6%.
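The abstract does not spell out how the votes are cast, so the following is only a minimal sketch of one possible reading: a frame-level majority vote over the label sequences produced by several systems. It assumes that every system labels the same fixed-length frames and that speaker labels have already been mapped to a common label set across systems, neither of which is stated in the abstract. The function name `majority_vote` and the tie-breaking rule are hypothetical choices for illustration, not the authors' method.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(system_labels: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-frame speaker labels from several systems by majority vote.

    Assumptions (not taken from the paper): all systems label the same
    frames, and speaker labels are already aligned across systems.
    """
    if not system_labels:
        raise ValueError("at least one system output is required")
    n_frames = len(system_labels[0])
    if any(len(labels) != n_frames for labels in system_labels):
        raise ValueError("all systems must label the same number of frames")

    combined = []
    # zip(*...) yields, for each frame, the tuple of labels from all systems.
    for frame in zip(*system_labels):
        counts = Counter(frame)
        top = max(counts.values())
        # Hypothetical tie-break: on a tie, keep the label of the
        # earliest-listed system among the tied labels.
        winner = next(label for label in frame if counts[label] == top)
        combined.append(winner)
    return combined


if __name__ == "__main__":
    system_a = ["A", "A", "B", "B"]
    system_b = ["A", "B", "B", "B"]
    system_c = ["A", "A", "A", "B"]
    print(majority_vote([system_a, system_b, system_c]))
```

With three input systems, a frame keeps a label whenever at least two systems agree on it. Listing the systems from most to least trusted makes the tie-break favor the more trusted system.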
Related papers
Using Weighted Oriented Optical Flow Histograms for Multimodal Speaker Diarization
Speaker diarization currently focuses on using audio features to partition an audio stream into speaker-homogeneous speech regions, in other words to determine “who spoke when”. Recent speaker diarization corpora contain video recordings in addition to the commonly used audio. Thus, we investigated the benefits of incorporating video features, namely histograms of weighted oriented optical flo...
Speaker diarization of spontaneous meeting room conversations
Speaker diarization is the task of identifying “who spoke when” in an audio stream containing multiple speakers. This is an unsupervised task as there is no a priori information about the speakers. Diagnostic studies on state-of-the-art diarization systems have isolated three main issues with the systems: overlapping speech, effects of background noise and speech/nonspeech detection errors on...
A study of new approaches to speaker diarization
This paper reports on work carried out at the 2008 JHU Summer Workshop examining new approaches to speaker diarization. Four different systems were developed and experiments were conducted using summed-channel telephone data from the 2008 NIST SRE. The systems are a baseline agglomerative clustering system, a new Variational Bayes system using eigenvoice speaker models, a streaming system using...
ELISA nist RT03 broadcast news speaker diarization experiments
This paper presents the ELISA consortium activities in automatic speaker diarization (also known as speaker segmentation) during the NIST Rich Transcription (RT) 2003 evaluation. The experiments were carried out on real broadcast news data (HUB4), in the framework of the ELISA consortium. The paper first shows the interest of segmentation in acoustic macro classes (like gender or bandwidth) as a...
Where did I go wrong?: Identifying troublesome segments for speaker diarization systems
The focus of this work is to identify types of segments that are difficult for speaker diarization systems. The diarization outputs of five state-of-the-art systems are analyzed on short/long segments as well as segments surrounding speaker changepoints. We found that, for all five systems, the diarization error rate (DER) increased as the duration of the segment decreased. Also, segments immedia...